Support Vector Machines for Risk Stratification of Childhood Leukaemia
Authors
Abstract
This document describes the methods used to implement support vector machines for the classification and stratification of children with acute lymphoblastic leukaemia (ALL), and the underlying variable interaction that these machines used during training. ALL accounts for a third of all childhood cancer patients and is characterised by an excess of immature B- and T-lymphocytes (components of the white blood cells). To give each patient the right treatment, researchers stratify patients using clinical information of great prognostic importance (prognostic factors). As research advances, new factors emerge and new treatments are developed in response. The dichotomous classifier investigated here could predict, with rather high performance, whether a patient would die within five years of diagnosis. Further, by investigating underlying patterns with structure detection, we were able to find interrelationships between the factors. Parameter search and backward feature extraction significantly improved performance during training, whereas principal component analysis performed in the SPSS application program did not influence performance. Finally, a multi-class support vector machine stratified patients into five risk groups and reclassified four patients into groups other than their original classification. The implementation procedure, methods and results are described in detail in the different sections of this document.
Supportvektormaskiner för riskstratifiering av barnleukemi
Summary (translated from Swedish)
This report presents the methods used in constructing support vector machines for the classification and stratification of children with acute lymphoblastic leukaemia (ALL), as well as the underlying variable interaction that the classifier used during training. ALL accounts for a third of all patients with childhood cancer and is detected through an increase in immature B- and T-lymphocytes (components of the white blood cells). To ensure that each patient receives the right treatment, researchers use clinical information that is of great importance for stratifying patients (prognostic factors). This stratification is continually refined, which gradually improves the treatment of patients. In this work, the dichotomous classifier could predict with high performance which patients were dead or alive five years after diagnosis. Furthermore, in searching for underlying patterns with structure detection, we were able to find interrelationships between the prognostic factors. Parameter search and variable extraction during training of the network led to a considerable improvement in performance, whereas principal component analysis in the SPSS application program did not affect the classification. Finally, a support vector machine for multi-class problems stratified patients into five risk groups and was able to reclassify four of them into a risk group other than the one originally assigned. Implementation, methods and results are described in detail in the different sections of this document.
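The parameter search and backward feature extraction mentioned in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the report: the data are synthetic (one informative feature and two noise features), the classifier is a linear SVM trained with a Pegasos-style subgradient method instead of whatever kernel and solver the authors used, "backward feature extraction" is read as backward feature elimination, and a single hold-out set stands in for the validation scheme. All function names are hypothetical.

```python
import random

def train_linear_svm(X, y, lam, epochs=30, seed=0):
    """Pegasos-style subgradient descent for a linear SVM without a bias term.
    Labels must be +1 / -1; features are assumed to be roughly centred."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]   # shrink (regularisation)
            if margin < 1.0:                           # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def accuracy(w, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    hits = sum(1 for xi, yi in zip(X, y)
               if yi * sum(wj * xj for wj, xj in zip(w, xi)) > 0)
    return hits / len(X)

def project(X, features):
    """Keep only the listed feature columns."""
    return [[row[j] for j in features] for row in X]

def score(features, lam, train, valid):
    """Train on the chosen features and return hold-out accuracy."""
    (Xtr, ytr), (Xva, yva) = train, valid
    w = train_linear_svm(project(Xtr, features), ytr, lam)
    return accuracy(w, project(Xva, features), yva)

def backward_elimination(features, lam, train, valid):
    """Repeatedly drop the feature whose removal hurts hold-out accuracy least,
    stopping as soon as any removal would lower the accuracy."""
    features = list(features)
    best = score(features, lam, train, valid)
    while len(features) > 1:
        trials = [(score([f for f in features if f != drop], lam, train, valid), drop)
                  for drop in features]
        trial_acc, drop = max(trials, key=lambda pair: pair[0])
        if trial_acc < best:
            break
        best = trial_acc
        features.remove(drop)
    return features, best

def make_data(n, seed):
    """Synthetic stand-in data: feature 0 carries the label, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        X.append([3.0 * label + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y

train, valid = make_data(200, seed=1), make_data(100, seed=2)
all_features = [0, 1, 2]
# Parameter search: pick the regularisation strength with the best hold-out accuracy.
best_lam = max([0.001, 0.01, 0.1], key=lambda lam: score(all_features, lam, train, valid))
selected, valid_acc = backward_elimination(all_features, best_lam, train, valid)
print("lambda:", best_lam, "features kept:", selected, "validation accuracy:", valid_acc)
```

On real clinical data the hold-out split would normally be replaced by cross-validation, and the search grid would cover the kernel parameters as well as the regularisation strength.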
Similar resources
Separating Well Log Data to Train Support Vector Machines for Lithology Prediction in a Heterogeneous Carbonate Reservoir
The prediction of lithology is necessary in all areas of petroleum engineering. This means that to design a project in any branch of petroleum engineering, the lithology must be well known. Support vector machines (SVMs) use an analytical approach to classification based on statistical learning theory, the principles of structural risk minimization, and empirical risk minimization. In this res...
A Comparative Study of Extreme Learning Machines and Support Vector Machines in Prediction of Sediment Transport in Open Channels
The limiting velocity in open channels to prevent long-term sedimentation is predicted in this paper using a powerful soft computing technique known as Extreme Learning Machines (ELM). The ELM is a single Layer Feed-forward Neural Network (SLFNN) with a high level of training speed. The dimensionless parameter of limiting velocity which is known as the densimetric Froude number (Fr) is predicte...
STAGE-DISCHARGE MODELING USING SUPPORT VECTOR MACHINES
Establishment of rating curves is often required by hydrologists for flow estimates in streams, rivers, etc. Measurement of discharge in a river is a time-consuming, expensive, and difficult process, and the conventional approach of regression analysis of the stage-discharge relation does not provide encouraging results, especially during floods. P
Gene variants of CYP1A1 and CYP2D6 and the risk of childhood acute lymphoblastic leukaemia; outcome of a case control study from Kashmir, India
Studies on associations of various polymorphisms in xenobiotic metabolizing genes with different cancers including acute lymphoblastic leukaemia (ALL) are mixed and inconclusive. The current study analyzed the relationship between polymorphisms of phase I xenobiotic metabolizing enzymes, cytochromes P450 1A1 (CYP1A1) and CYP2D6 and childhood ALL in Kashmir, India. We recruited 200 confirmed ALL...
Household exposure to pesticides and risk of childhood acute leukaemia.
OBJECTIVES To investigate the relation between childhood acute leukaemia and household exposure to pesticides. METHODS The study included 280 incident cases of acute leukaemia and 288 controls frequency matched on gender, age, hospital, and ethnic origin. The data were obtained from standardised face to face interviews of the mothers with detailed questions on parental occupational history, h...